-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC][BPF] Report Unreachable Behavior from IR #126858
base: main
Are you sure you want to change the base?
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
✅ With the latest revision this PR passed the undef deprecator. |
ae4596e
to
3526fd3
Compare
Value *RetValue = I->getReturnValue(); | ||
// PoisonValue is a special UndefValue where compiler intentionally to | ||
// poisons a value since it shouldn't be used. | ||
if (!RetValue || isa<PoisonValue>(RetValue) || !isa<UndefValue>(RetValue)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like undef is deprecated.
Do we have to check it or PoisonValue is enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some extreme case, e.g.,
int foo(void) {
int i[2];
return i[1];
}
The eventual IR will look like
define dso_local i32 @foo() #0 {
ret i32 undef
}
For such cases, checking !isa<PoisonValue>(RetValue)
is not enough, we should check !isa<UndefValue>(RetValue)
to ensure to report something wrong to user.
But such case should be really rare and it is easier for user to find out what is going on.
The prog Marc Suñé reported does not need the above check. Since upstream doesn't like to involve 'undef', I will remove the above checking and also remove the simple test which results in IR 'ret i32 undef' in the next revision.
3526fd3
to
9671d28
Compare
clang-format still does not like 'undef' in the test. Will try not to have 'undef' in IR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM/Clang generally do not allow the use of backend-generated warnings, because they produce unintelligible, optimization-dependent false-positives. There are some specific exceptions, but what you're doing here is basically exactly the kind of backend warning we want to avoid.
Does BPF have any kind of support for trapping or otherwise indicating an error? There is existing support to compile unreachable to a trap instruction (TrapUnreachable), but from a quick look, it doesn't seem like there is any BPF instruction this can be mapped to?
} | ||
|
||
dbgs() << "WARNING: unreachable in func " << F.getName() | ||
<< ", due to uninitialized variable?\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dbgs() is not a suitable method to report user-visible warnings. This needs to go through DiagnosticInfo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I just updated a change to use DiagnosticInfo.
9671d28
to
437a5ec
Compare
The backend warning here is BPF specific as several conditions are used to check before issuing the warning.
No, BPF arch does not have trapping insn or another other insn similar to trap/error. I am not sure whether bpf should have trap insn or not. For a prog passing verifier, it should not trap in the kernel. Let us say we do introduce trap insn in BPF ISA, and llvm indeed inserts one which implies some path will hit 'trap' so probably we should error out at compile time. But it has slightly chance compiler may have a false positive, so I think a warning seems more appropriate. |
In my latest change, I remove the selftest due to clang-format does not like 'undef' in the IR. If anybody wants to run it at llvm/test/CodeGen/BPF, the gist link is |
Update a new revision with the following key changes:
|
943e279
to
1c19bae
Compare
} | ||
|
||
F.getContext().diagnose( | ||
DiagnosticInfoGeneric(Twine("unreachable in func ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that reporting file:line would still be useful in some cases.
If the error could say that in function FOO from line X to line Y that code was deleted as unreachable
that would help users to address the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, if debuginfo (-g) is available, the error message will be something like
error: in func repro from line 155 to the end of func that code was deleted as unreachable,
due to uninitialized variable? try -Wuninitialized?
Without debuginfo, the error message will be like
error: in func repro that code was deleted as unreachable,
due to uninitialized variable? try -Wuninitialized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
error: in func repro from line 155 to the end of func that code was deleted as unreachable,
due to uninitialized variable? try -Wuninitialized?
this looks better, maybe add '' around function name like we do elsewhere and don't shorten "function" to "func" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new message looks like:
error: in function "repro" from line 155 to the end of function that code was deleted as unreachable.
due to uninitialized variable? try -Wuninitialized?
1c19bae
to
c9ba781
Compare
Marc Suñé (Isovalent, part of Cisco) reported an issue where an uninitialized variable caused generated bpf prog binary code not working as expected. The reproducer is in [1] where the flags “-Wall -Werror” are enabled, but there is no warning and compiler may take advantage of uninit variable to do aggressive optimization. In discussion [2], various approaches are discussed, e.g., improving compiler to detect undefined behavior due to uninitialized variables, trying to use ubsan (-fsanitize=undefined), and making -ftrivial-auto-var-init=zero as the bpf default flags. I tried [3] with -ftrivial-auto-var-init=zero and eventually we decided no-go since first it may introduce performance regression and second the prog may still be wrong if the prog expects a non-zero value. The ubsan apprach seems not working as well since it involves runtime callback func ([4]). The approach here is not to do complicate compiler analysis to detect whether where is undef behavior which may impact final codegen. Rather, we relies on compiler to do its normal transformation and at later IR passes stage, a BPF backend pass is inserted to check whether undef behavior is in IR or not. Note that if undef behavior indeed impacts codes, the compiler will discard those related codes with simple 'undef' or 'unreachable'. For example, for the case [1], before SCCPPass, the IR looks like ``` define dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 switch i8 undef, label %51 [ i8 59, label %56 i8 44, label %57 i8 0, label %9 i8 43, label %9 i8 51, label %9 i8 60, label %9 ] 9: ; preds = %1, %1, %1, %1 %10 = sub i32 40, %6 ... ``` Note that 'undef' is used for switch key due to one of early pass LoopFullUnrollPass. After SCCPPass: ``` efine dso_local i32 @repro(ptr noundef %0) #0 section "classifier" { %2 = alloca %struct.ipv6_opt_hdr, align 8 %3 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @repro.____fmt, i32 noundef 6) llvm#2 %4 = tail call ptr asm sideeffect "$0 = *(u32 *)($1 + $2)", "=r,r,i"(ptr %0, i64 76) llvm#2, !srcloc !3 %5 = ptrtoint ptr %4 to i64 %6 = trunc i64 %5 to i32 %7 = tail call i64 (ptr, i32, ...) inttoptr (i64 6 to ptr)(ptr noundef nonnull @icmp6_ndisc_validate.____fmt, i32 noundef 23) llvm#2 call void @llvm.lifetime.start.p0(i64 2, ptr nonnull %2) llvm#2 %8 = getelementptr inbounds nuw i8, ptr %2, i64 1 unreachable } ``` For another example, ``` $ cat t.c int foo() { int i[2]; return i[1]; } ``` Before SROAPass pass, ``` define dso_local i32 @foo() #0 { %1 = alloca [2 x i32], align 4 call void @llvm.lifetime.start.p0(i64 8, ptr %1) llvm#2 %2 = getelementptr inbounds [2 x i32], ptr %1, i64 0, i64 1 %3 = load i32, ptr %2, align 4, !tbaa !3 call void @llvm.lifetime.end.p0(i64 8, ptr %1) llvm#2 ret i32 %3 } ``` After SROAPass pass, ``` define dso_local i32 @foo() #0 { ret i32 undef } ``` Besides the above two test cases, the following three patterns are also covered: - It is possible llvm may generate codes where a default branch to 'unreachable' location. Ignore such 'unreachable' instances. See [5] or some comments in [2]. - Handle pattern like __bpf_unreachable (defined in bpf_helpers.h). - Functions with naked attribute will have 'unreachable' at the end of function. Ignore such functions. Tested with bpf selftests and there are no warnings issued. [1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3 [2] https://discourse.llvm.org/t/detect-undefined-behavior-due-to-uninitialized-variables-in-bpf-programs/84116?u=yonghong-song [3] llvm#125601 [4] https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/ubsan/ubsan_interface.inc [5] https://lore.kernel.org/lkml/[email protected]/
Upstream does not like to check undef value and clang-format will fail due to this. Let us remove checking for returning undef value. A related test is also removed.
The test undef-sccp.ll still uses 'undef' in the IR and clang-format complains it. In this particular case, 'undef' is generated in sroa and it needs a lot of other passes to reach sccp. So let us remove this test. The test itself can be accessed in https://gist.github.com/yonghong-song/11be8603ad9422f418f60b41beded047 which can be tested in llvm/test/CodeGen/BPF directory. DiagnosticInfo() is the recommended interface for warnings comparing to dbgs().
Report an error by default. Add a new flag -bpf-disable-check-unreachable-ir to disable checking. Depend on whether debuginfo is available or not, the error message will look like in func <func> from line <line num> to the end of func that code was deleted as unreachable, due to uninitialized variable? try -Wuninitialized? or in func <func> that code was deleted as unreachable, due to uninitialized variable? try -Wuninitialized?
c9ba781
to
85bdc2c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this is BPF-specific, it is introducing constraints on target-independent transforms: specifically, any transform which may introduce unreachable code is illegal.
As such, I think this needs an RFC to determine how to go forward. The needs to describe the general impact on transformations, and whether there are any existing transforms impacted by the constraint.
@efriedma-quic I marked this patch as RFC. I think the key thing is about when 'unreachable' is introduced today in middle end and what is the expected scenario 'unreachable' could be introduced in the future. For the 'unreachable' introduced in the future, I guess it should be okay. The BPF can be adjusted as necessary, similar to other transformations. |
On a conventional target, in general, code with undefined behavior is fine, as long as it doesn't actually execute at runtime. People write source code based on this, and transforms duplicate code based on this. And this makes diagnostics in the backend impractical. I understand that BPF operates under different constraints than conventional targets... in particular, BPF-C is a subset of C. It only allows programs that pass the BPF verifier. So maybe you don't run into exactly the same issues with late diagnostics. But, I'm not sure what the implications are here for transforms. For example, can JumpThreading introduce an "unreachable" that triggers the error here? If so, do we need to restrict JumpThreading on BPF? |
JumpThreading works for BPF target as well and the middle end optimization does not need to restrict JumpThreading for BPF target. The bpf backend pass implemented in this patch will handle such code patterns (BPF backend will do some analysis if necessary). If in the future, middle end optimization has new cases which introduce unreachable, BPF backend will try to handle handle such new patterns if necessary and this should be fine as they typically will be in the same release. So in summary, middle end optimization does not need to do any special things for BPF backend. |
cc @WenleiHe |
I was curious if there are other cases besides switch instruction, where Given this, I tend to agree that a runtime trap would be a simpler solution. E.g. something along the lines libbpf does here, where it inserts a call to unknown helper function. In kernel Verifier can be changed to report something about undefined behaviour trap to simplify debugging. BPF backend can be changed to avoid clobbering registers because of this special trap call. Just my 5 cents. |
Could you share the details about these experiments? I would like to see whether I missed anything which we should cover?
Regarding to runtime trap, I actually examined this as well before going to compiler approach. That approach is to change 'unreachable' to some newly created trap insn and later on when verifier reaches these trap insn, verification will fail. But I did a few examples and found that only limited patterns so I prefer to use the compiler approach. Note that the 'unreachable' checking is done at roughly the end of middle end optimizaiton (i.e. beginning of backend IR passes). So let us gather more examples here before considering going to bpf verifier to deal with these 'unreachable' traps. Note that there is risk that these 'unreachable' traps may be rejected by verifier but actually it may be safe because of verifier some in-precise analysis. So the best way is to do at runtime, but it has its own complication.
So my suggestion is to let us do some thorough analysis at compile time. Note that it is always better for compiler to flag something first. If eventually it turns out there are too many patterns which may have fake unreachables, then we could consider run-time approach then.
|
I did not do any experiments, tried to analyze places where One option would be to copy this transformation to x86 backend and e.g. compile Linux kernel or some other big project, to see which code patterns introduce |
Thanks. It would be great to collect some statistics. I only tried with bpf programs in kernel bpf selftests. Maybe trying with other architectures can expose more patterns. |
I also discussed with @WenleiHe a little bit about this. Another solution is to extend 'unreachable' insn in LLVM. For example, we could have 'unreachable' like below
With carried and passed from various optimization, downstream can reason about 'unreachable' insn. This will make BPF backend pass easier to only trigger error for due to undef bahavior. WDYT? Is this a reasonable approach? |
Marc Suñé (Isovalent, part of Cisco) reported an issue where an uninitialized variable caused generated bpf prog binary code not working as expected. The reproducer is in [1] where the flags “-Wall -Werror” are enabled, but there is no warning and compiler may take advantage of uninit variable to do aggressive optimization.
In discussion [2], various approaches are discussed, e.g., improving compiler to detect undefined behavior due to uninitialized variables, trying to use ubsan (-fsanitize=undefined), and making -ftrivial-auto-var-init=zero as the bpf default flags.
I tried [3] with -ftrivial-auto-var-init=zero and eventually we decided no-go since first it may introduce performance regression and second the prog may still be wrong if the prog expects a non-zero value. The ubsan apprach seems not working as well since it involves runtime callback func ([4]).
The approach here is not to do complicate compiler analysis to detect whether where is undef behavior which may impact final codegen. Rather, we relies on compiler to do its normal transformation and at later IR passes stage, a BPF backend pass is inserted to check whether undef behavior is in IR or not. Note that if undef behavior indeed impacts codes, the compiler will discard those related codes with simple 'undef' or 'unreachable'.
For example, for the case [1], before SCCPPass, the IR looks like
Note that 'undef' is used for switch key due to one of early pass LoopFullUnrollPass. After SCCPPass:
Besides the above case, the following three patterns are also covered:
A bpf flag
-bpf-disable-check-unreachable-ir
is introduced to disable this checking.Tested with bpf selftests and there are no errors issued.
[1] https://github.com/msune/clang_bpf/blob/main/Makefile#L3
[2] https://discourse.llvm.org/t/detect-undefined-behavior-due-to-uninitialized-variables-in-bpf-programs/84116?u=yonghong-song
[3] #125601
[4] https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/ubsan/ubsan_interface.inc
[5] https://lore.kernel.org/lkml/[email protected]/